Structural Matching of Parallel Texts

Authors

  • Yuji Matsumoto
  • Hiroyuki Ishimoto
  • Takehito Utsuro
Abstract

This paper describes a method for finding structural matching between parallel sentences of two languages (such as Japanese and English). Parallel sentences are analyzed based on unification grammars, and structural matching is performed by making use of a similarity measure of word pairs in the two languages. Syntactic ambiguities are resolved simultaneously in the matching process. The results serve as a useful source for extracting linguistic and lexical knowledge.
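
The idea of resolving ambiguity during matching can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract does not specify the matching procedure, so the dependency-edge representation, the scoring rule, and every name below (`SIMILARITY`, `match_score`, `best_match`, the example parses) are assumptions made purely for illustration. Each sentence is given a list of candidate parses, each parse is a list of (head, dependent) word pairs, and the pair of parses whose edges agree best under a word-pair similarity table is selected.

```python
from itertools import product

# Hypothetical word-pair similarity table, e.g. derived from a bilingual
# dictionary. The values are invented for this example.
SIMILARITY = {
    ("kare", "he"): 1.0,
    ("hon", "book"): 1.0,
    ("yomu", "read"): 1.0,
    ("toshokan", "library"): 1.0,
}


def sim(ja_word, en_word):
    """Similarity of a Japanese/English word pair (0.0 if unknown)."""
    return SIMILARITY.get((ja_word, en_word), 0.0)


def edge_score(ja_edge, en_edge):
    """Two edges match well when both their heads and dependents are similar."""
    (ja_head, ja_dep), (en_head, en_dep) = ja_edge, en_edge
    return sim(ja_head, en_head) * sim(ja_dep, en_dep)


def match_score(ja_parse, en_parse):
    """Sum, over Japanese edges, of the best-matching English edge score."""
    return sum(
        max((edge_score(j, e) for e in en_parse), default=0.0)
        for j in ja_parse
    )


def best_match(ja_parses, en_parses):
    """Pick the pair of candidate parses with the highest matching score."""
    return max(product(ja_parses, en_parses),
               key=lambda pair: match_score(*pair))


# "kare wa toshokan de hon o yonda": one parse, all words depend on the verb.
JA_PARSE = [("yomu", "kare"), ("yomu", "hon"), ("yomu", "toshokan")]

# "He read the book in the library": the phrase "in the library" may attach
# to the verb (EN_VERB_ATTACH) or to the noun "book" (EN_NOUN_ATTACH).
EN_VERB_ATTACH = [("read", "he"), ("read", "book"), ("read", "library")]
EN_NOUN_ATTACH = [("read", "he"), ("read", "book"), ("book", "library")]

chosen_ja, chosen_en = best_match([JA_PARSE], [EN_VERB_ATTACH, EN_NOUN_ATTACH])
```

In this sketch the verb-attachment reading of the English sentence is chosen because all three of its edges have a similar counterpart in the Japanese parse, so the attachment ambiguity is settled as a side effect of matching.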

Similar resources

How to Facilitate the Proof of Theorems by Using the Induction-matching, and by Generalization

In this paper, we show how we conceive the proof of theorems by structural induction. Our aim is to facilitate the proof of theorems which can lead, in a context of automatic theorem proving, to very lengthy (or even impossible) proofs. We use a very simple tool, the i-matching or induction-matching, which allows us, on the one hand to define an original procedure of generalization, and on the ...

A Method to Overcome Computer Word Size Limitation in Bit-Parallel Pattern Matching

The performance of pattern matching algorithms based on bit-parallelism degrades when the input pattern length exceeds the computer word size. Although several divide-and-conquer methods have been proposed to overcome that limitation, the resulting schemes are not very efficient and are hard to implement. This study introduces a new fast bit-parallel pattern matching algorithm that is capa...

Aligning Noisy Parallel Corpora Across Language Groups: Word Pair Feature Matching by Dynamic Time Warping

We propose a new algorithm, DK-vec, for aligning pairs of Asian/Indo-European noisy parallel texts without sentence boundaries. The algorithm uses frequency, position and recency information as features for pattern matching. Dynamic Time Warping is used as the matching technique between word pairs. This algorithm produces a small bilingual lexicon which provides anchor points for alignment.

Real-Time Identification of Parallel Texts

Parallel texts are documents that present parallel translations. This paper describes a simple method that can be deployed on a real-time news feed to create an infinitely growing source of parallel texts in French and English. Our experiment was conducted on the Canada Newswire news feed. Given some of its intrinsic properties, it was possible to deploy a relatively simple text matching techniques ...

Parallel Overlap and Similarity Detection in Semi-Structured Document Collections

Proliferation of digital libraries plus high availability of electronic documents from the Internet have created new challenges for computer science researchers and professionals. This paper discusses the problems of using parallel and cluster computing systems for detecting plagiarism in large collections of semi-structured electronic texts, including software written in formal languages at on...

Publication date: 1993